Compiler-Directed Static Classification of Value Locality Behavior

Authors

  • Qing Zhao
  • David J. Lilja
Abstract

Predicting the values that are likely to be produced by instructions has been suggested as a way of increasing the instruction-level parallelism available in a wide-issue processor. One of the potential difficulties in exploiting the predictability of values, however, is selecting the proper type of predictor, such as a last-value predictor, a stride predictor, or a context-based predictor, for a given instruction. We propose a compiler-directed classification scheme that statically partitions all of the instructions in a program into several groups, each of which is associated with a specific value predictability pattern. This value predictability pattern is encoded into the instructions to identify the type of value predictor that will be best suited for predicting the values that are likely to be produced by each instruction at run-time. Both an idealized profile-based compiler implementation and an implementation based on the GCC compiler are studied to show the performance bounds for the proposed technique. Our simulations based on the SimpleScalar tool set and the SPEC95 integer benchmarks indicate that this approach can substantially reduce the number of read/write ports needed in the value predictor for a given level of performance. This static partitioning approach also produces better performance than a dynamically partitioned approach for a given hardware configuration. Finally, this work demonstrates the connection between value locality behavior and source-level program structures, thereby leading to a deeper understanding of the causes of this behavior.
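To make the three predictor types named in the abstract concrete, the following is a minimal sketch of how a profile-based classifier might label a single instruction from a trace of the values it produced. It is not the authors' implementation: the function names, the 2-value context, the 0.5 hit-rate threshold, and the "unpredictable" label are all assumptions chosen for illustration.

```python
def last_value_hits(trace):
    """Count correct predictions when guessing 'same as the previous value'."""
    return sum(1 for prev, cur in zip(trace, trace[1:]) if prev == cur)


def stride_hits(trace):
    """Count correct predictions when guessing 'previous value + last stride'."""
    hits = 0
    for i in range(2, len(trace)):
        stride = trace[i - 1] - trace[i - 2]
        if trace[i - 1] + stride == trace[i]:
            hits += 1
    return hits


def context_hits(trace, order=2):
    """Count correct predictions from a table keyed by the last `order` values."""
    table = {}
    hits = 0
    for i in range(order, len(trace)):
        key = tuple(trace[i - order:i])
        if table.get(key) == trace[i]:
            hits += 1
        table[key] = trace[i]  # remember what followed this context
    return hits


def classify(trace, threshold=0.5):
    """Label a value trace with the best-scoring pattern (hypothetical rule)."""
    if len(trace) < 3:
        return "unpredictable"
    n = len(trace) - 1  # number of values that could have been predicted
    scores = {
        "last_value": last_value_hits(trace) / n,
        "stride": stride_hits(trace) / n,
        "context": context_hits(trace) / n,
    }
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else "unpredictable"
```

Under these assumptions, a constant trace is labeled `last_value`, an arithmetic sequence `stride`, and a short repeating pattern such as `[1, 5, 2, 1, 5, 2, ...]` `context`. In a scheme like the one the abstract describes, such a label would then be encoded into the instruction so the hardware can route it to the matching predictor.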

Similar resources

Mapping Using Static Call Graph Estimation

As the gap between memory and processor performance continues to grow, it becomes increasingly important to exploit cache memory effectively. One technique used by compilers and linkers to improve the performance of the cache is code reordering. Code reordering optimizations rearrange a program so that sections of the program with temporal locality will be placed next to each other in the final pro...

Heap Data Allocation to Scratch-Pad Memory in Embedded Systems

Title of dissertation: Heap Data Allocation to Scratch-Pad Memory in Embedded Systems. Angel Dominguez, Doctor of Philosophy, 2007. Dissertation directed by: Professor Rajeev K. Barua, Department of Electrical and Computer Engineering. This thesis presents the first-ever compile-time method for allocating a portion of a program’s dynamic data to scratch-pad memory. A scratch-pad is a fast directly a...

A Graph Based Framework to Detect Optimal Memory Layouts for Improving Data Locality

In order to extract high levels of performance from modern parallel architectures, the effective management of deep memory hierarchies is very important. While architectural advances in caches help in better utilization of the memory hierarchy, compiler-directed locality enhancement techniques are also important. In this paper we propose a locality improvement technique that uses data space (ar...

Layout Transformations for Heap Objects Using Static Access Patterns

As the amount of data used by programs increases due to the growth of hardware storage capacity and computing power, efficient memory usage becomes a key factor for performance. Since modern applications heavily use structures allocated in the heap, this paper proposes an efficient structure layout based on static analyses. Unlike most of the previous work, our approach is an entirely static tr...

Publication date: 2000